Introduction
Trees play a pivotal role in upholding a city's well-being, adorning its streets, parks, and private spaces. Delving into the statistics of urban street trees yields valuable insights for enthusiasts, city planners, and the public. This project explores street trees in Vancouver, British Columbia, investigating the relationship between height and diameter, assessing the prevalence of different species and genera, and identifying neighbourhoods with diverse tree characteristics. These characteristics include the largest variety of genera, the widest range of diameters, the thickest tree, and the highest tree density.
The largest trees documented in Vancouver are typically found in parks, but this project focuses exclusively on street trees. Matasci et al. (2018) visualized attributes of Vancouver trees, but their scope was limited to park trees exceeding 15 meters in height. Galle et al. (2021) conducted a broader examination of street tree biodiversity across eight cities, Vancouver included; however, their study addressed biodiversity statistics only.
The dataset utilized in this project constitutes a subset of publicly available information released by the City of Vancouver. Crafted by UBC staff, this subset comprises 5,000 observations selected from a pool of over 150,000 data points. The project was executed using a Jupyter notebook and the Python programming language, incorporating libraries such as Altair for visualization and Pandas for data manipulation.
Analysis
Table 1. Column Names
| Column | Explanation | Unit |
|---|---|---|
| DIAMETER | Diameter at breast height (DBH) in inches | Inches |
| HEIGHT_RANGE_ID | Height range in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft) | NA |
| NEIGHBOURHOOD_NAME | City-defined local area where the tree is located | NA |
| GENUS_NAME | Genus name | NA |
| SPECIES_NAME | Species name | NA |
| COMMON_NAME | Common name | NA |
These descriptions are also available in the dataset schema on the City of Vancouver Open Data Portal.
Number of null values and type of each column:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 21 columns):
| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | Unnamed: 0 | 5000 non-null | int64 |
| 1 | std_street | 5000 non-null | object |
| 2 | on_street | 5000 non-null | object |
| 3 | species_name | 5000 non-null | object |
| 4 | neighbourhood_name | 5000 non-null | object |
| 5 | date_planted | 2363 non-null | datetime64[ns] |
| 6 | diameter | 5000 non-null | float64 |
| 7 | street_side_name | 5000 non-null | object |
| 8 | genus_name | 5000 non-null | object |
| 9 | assigned | 5000 non-null | object |
| 10 | civic_number | 5000 non-null | int64 |
| 11 | plant_area | 4950 non-null | object |
| 12 | curb | 5000 non-null | object |
| 13 | tree_id | 5000 non-null | int64 |
| 14 | common_name | 5000 non-null | object |
| 15 | height_range_id | 5000 non-null | int64 |
| 16 | on_street_block | 5000 non-null | int64 |
| 17 | cultivar_name | 2658 non-null | object |
| 18 | root_barrier | 5000 non-null | object |
| 19 | latitude | 5000 non-null | float64 |
| 20 | longitude | 5000 non-null | float64 |
dtypes: datetime64[ns](1), float64(3), int64(5), object(12)
memory usage: 820.4+ KB
The date_planted column is missing about 53% of its values (2,637 of 5,000) and cultivar_name about 47% (2,342 of 5,000). Both columns are excluded from the main analysis, although the planting year is still displayed where it is available. The missing values appear to be random, with no evident association with any other variable in the data. An analysis of the null values shows that the majority belong to trees with a diameter of less than 30 inches, but this likely reflects the fact that most trees in the dataset are below 30 inches in diameter. The missing values also span all height ranges, with the bulk between 20 and 50 feet. Finally, a stroke chart shows no clear pattern between the missing values and the other variables.
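The share of missing values per column can be checked directly from the non-null counts in the listing above. The sketch below uses only those published counts; the names `NON_NULL` and `missing_share` are illustrative and do not come from the original notebook.

```python
# Non-null counts copied from the info() listing; the frame has 5000 rows.
TOTAL_ROWS = 5000
NON_NULL = {"date_planted": 2363, "cultivar_name": 2658, "plant_area": 4950}


def missing_share(non_null_count, total=TOTAL_ROWS):
    """Return the fraction of rows that have no value in a column."""
    return (total - non_null_count) / total


missing = {column: missing_share(count) for column, count in NON_NULL.items()}

for column, share in missing.items():
    print(f"{column}: {share:.1%} missing")
```

With the data loaded in pandas, the same figures would likely come from `df.isna().mean()`, assuming the DataFrame is named `df`.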
A description of the numeric values:
| | Unnamed: 0 | date_planted | diameter | civic_number | tree_id | height_range_id | on_street_block | latitude | longitude |
|---|---|---|---|---|---|---|---|---|---|
| count | 5000.000000 | 2363 | 5000.000000 | 5000.000000 | 5000.000000 | 5000.00000 | 5000.000000 | 5000.000000 | 5000.000000 |
| mean | 14861.920400 | 2003-09-06 04:03:08.912399488 | 12.340888 | 2975.707600 | 128682.584600 | 2.73440 | 2960.227000 | 49.247349 | -123.107128 |
| min | 2.000000 | 1989-10-31 00:00:00 | 0.000000 | 2.000000 | 36.000000 | 0.00000 | 0.000000 | 49.202783 | -123.220560 |
| 25% | 7192.750000 | 1997-11-06 00:00:00 | 4.000000 | 1300.500000 | 61321.500000 | 2.00000 | 1300.000000 | 49.230152 | -123.144178 |
| 50% | 14870.000000 | 2003-02-12 00:00:00 | 10.000000 | 2639.000000 | 130130.500000 | 2.00000 | 2600.000000 | 49.247981 | -123.105861 |
| 75% | 22366.750000 | 2009-11-17 00:00:00 | 18.000000 | 4123.000000 | 191332.000000 | 4.00000 | 4100.000000 | 49.263275 | -123.063484 |
| max | 29992.000000 | 2019-05-07 00:00:00 | 71.000000 | 9113.000000 | 270750.000000 | 9.00000 | 9100.000000 | 49.293930 | -123.023311 |
| std | 8680.023278 | NaN | 9.266600 | 2078.580429 | 75412.260406 | 1.56957 | 2086.861052 | 0.021251 | 0.049137 |
Among trees with a recorded planting date, the oldest was planted in 1989 and the youngest in 2019, a span of almost 30 years. The numerical columns that relate to the questions of interest are diameter and height range. The diameter ranges from 0 to 71 inches, and the height range id runs from 0 to 9, that is, from under 10 feet up to 90-100 feet. The distribution of tree counts by planting year shows that most trees were planted between 1992 and 2014; fewer than 40 trees per year were planted before 1992 or after 2014.
The tables below show the most frequent genera and species.
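Because height is stored as an id rather than a measurement, it helps to translate ids back into feet when reading the results. The helper below is a minimal sketch that assumes every id covers a 10-foot band, extrapolating from the two examples given in Table 1; the function name `height_range_feet` is hypothetical.

```python
def height_range_feet(height_range_id):
    """Map a height_range_id to its (lower, upper) bounds in feet.

    Assumption: each id covers a 10-foot band, extrapolated from the
    Table 1 examples (0 = 0-10 ft, 1 = 10-20 ft).
    """
    if not 0 <= height_range_id <= 9:
        raise ValueError("height_range_id in this sample runs from 0 to 9")
    lower = height_range_id * 10
    return lower, lower + 10


for height_id in (0, 2, 9):
    low, high = height_range_feet(height_id)
    print(f"id {height_id}: {low}-{high} ft")
```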
| | genus_name | frequency |
|---|---|---|
| 0 | ACER | 1218 |
| 1 | PRUNUS | 1050 |
| 2 | TILIA | 238 |
| 3 | FRAXINUS | 238 |
| 4 | QUERCUS | 218 |
| 5 | CARPINUS | 188 |
| 6 | FAGUS | 174 |
| 7 | MALUS | 151 |
| 8 | MAGNOLIA | 139 |
| 9 | CRATAEGUS | 134 |
| | genus_name | species_name | frequency |
|---|---|---|---|
| 0 | PRUNUS | SERRULATA | 463 |
| 1 | ACER | PLATANOIDES | 444 |
| 2 | PRUNUS | CERASIFERA | 396 |
| 3 | ACER | RUBRUM | 261 |
| 4 | CARPINUS | BETULUS | 170 |
| 5 | FAGUS | SYLVATICA | 167 |
| 6 | TILIA | EUCHLORA X | 152 |
| 7 | ACER | FREEMANI X | 127 |
| 8 | ACER | CAMPESTRE | 124 |
| 9 | MAGNOLIA | KOBUS | 93 |
The most frequent genera are ACER and PRUNUS, and the most frequent species are PRUNUS SERRULATA, ACER PLATANOIDES, PRUNUS CERASIFERA, and ACER RUBRUM.
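To put these frequencies in proportion, the counts from the genus table can be converted into shares of the 5,000-tree sample. The sketch below uses only the published counts; the variable names are illustrative.

```python
# Frequencies copied from the genus table; the sample has 5000 trees.
TOTAL_TREES = 5000
GENUS_COUNTS = {
    "ACER": 1218, "PRUNUS": 1050, "TILIA": 238, "FRAXINUS": 238,
    "QUERCUS": 218, "CARPINUS": 188, "FAGUS": 174, "MALUS": 151,
    "MAGNOLIA": 139, "CRATAEGUS": 134,
}

# Share of the whole sample held by each of the ten listed genera.
shares = {genus: count / TOTAL_TREES for genus, count in GENUS_COUNTS.items()}
top_two_share = shares["ACER"] + shares["PRUNUS"]
top_ten_share = sum(shares.values())

print(f"ACER + PRUNUS: {top_two_share:.1%}")
print(f"Top ten genera: {top_ten_share:.1%}")
```

ACER and PRUNUS together account for roughly 45% of the sample, and the ten listed genera for roughly 75%.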
The literature shows a positive relationship between the height and diameter of trees. The visual below explores the relationship between these two variables and displays each data point's planting year, neighbourhood, common name, and URL. The drop-down menu filters by genus, and the line connects the median diameter of each height range category. The visual is interactive: it supports clicking, zooming, and panning, and a double click on the white space returns the chart to full view.
There is a positive relationship between diameter and height. Height id 9 has fewer than 10 observations, so the drop in median diameter from id 8 to id 9 may simply reflect a sample too small to be representative. The median diameter is 35 inches for trees between 80 and 90 feet. The thickest tree is a CEDRUS DEODARA in Kitsilano.
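The median line in the chart amounts to grouping diameters by height id and taking the median of each group. The sketch below shows that aggregation in plain Python and also reports each group's size, which makes thinly populated height ranges easy to spot. The sample pairs are made up for illustration and are not rows from the Vancouver dataset; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (height_range_id, diameter in inches) pairs, invented for
# illustration only.
sample_trees = [(1, 3.0), (1, 5.0), (2, 8.0), (2, 10.0), (2, 12.0), (4, 20.0)]


def median_diameter_by_height(trees):
    """Return {height_range_id: (median diameter, group size)}."""
    groups = defaultdict(list)
    for height_id, diameter in trees:
        groups[height_id].append(diameter)
    return {
        height_id: (median(values), len(values))
        for height_id, values in sorted(groups.items())
    }


medians = median_diameter_by_height(sample_trees)
for height_id, (mid, size) in medians.items():
    print(f"height id {height_id}: median {mid} in (n={size})")
```

In pandas, a comparable result would likely come from `df.groupby("height_range_id")["diameter"].agg(["median", "count"])`, assuming the DataFrame is named `df`.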
The chart below explores which neighbourhood has the largest variety of diameters. The diameters are grouped into 10-inch bins, and the size of each point scales with the tree count, so larger circles indicate a higher count. The color scale runs from lightest to darkest in descending order. To the best of the author's knowledge, as of May 2023 no published work had explored the distribution of street tree diameters by neighbourhood. This chart is also interactive, and the radio buttons filter by height range.
Trees below 10 inches are the most common, most trees are below 50 inches, and the tree count decreases as the diameter increases. Shaughnessy and Kitsilano have the largest range of diameters.
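The 10-inch binning behind this chart can be sketched as follows. The text does not say which bin edge is inclusive, so the code assumes bins are closed on the left (a diameter of exactly 10 inches falls in the 10-20 bin). The neighbourhood labels and diameters are invented for illustration, and `diameter_bin` is a hypothetical helper.

```python
from collections import Counter

BIN_WIDTH = 10  # inches, as described in the text


def diameter_bin(diameter):
    """Return the (lower, upper) edges of the 10-inch bin holding a diameter.

    Assumption: bins are closed on the left, so 10.0 falls in 10-20.
    """
    lower = int(diameter // BIN_WIDTH) * BIN_WIDTH
    return lower, lower + BIN_WIDTH


# Hypothetical (neighbourhood, diameter in inches) pairs, invented for
# illustration only.
sample_trees = [("A", 4.0), ("A", 9.5), ("A", 10.0), ("B", 71.0)]

# Tree count per (neighbourhood, bin), the quantity encoded as circle size.
bin_counts = Counter((name, diameter_bin(d)) for name, d in sample_trees)

for (name, (low, high)), count in sorted(bin_counts.items()):
    print(f"{name}: {low}-{high} in -> {count} trees")
```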
The visual below explores the neighbourhood with the widest distribution of genera and displays neighbourhoods and genus count. The map is supplemented with a bar chart showing the neighbourhoods in descending order of genus count. Both charts are interactive and show more information when hovering the mouse over the bars and neighbourhoods.
Renfrew-Collingwood has the most unique genera. West End has half as many, at 23 unique genera, and Strathcona has the fewest.
The map and scatter plot below show which neighbourhoods have the highest average diameter and the highest density. The two charts interact with each other: clicking on a neighbourhood on the map highlights the corresponding data point on the scatter plot, and vice versa. A double click on the white area of either chart clears the selection.
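The genus count per neighbourhood is the number of distinct genera observed there, not the number of trees. A minimal sketch of that calculation follows; the neighbourhood labels and rows are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical (neighbourhood, genus) pairs, invented for illustration only.
sample_trees = [
    ("North", "ACER"), ("North", "PRUNUS"), ("North", "ACER"),
    ("South", "TILIA"),
]


def genus_count_by_neighbourhood(trees):
    """Return (neighbourhood, unique genus count) pairs, largest count first."""
    genera = defaultdict(set)
    for neighbourhood, genus in trees:
        genera[neighbourhood].add(genus)  # a set ignores repeated genera
    counts = {name: len(found) for name, found in genera.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = genus_count_by_neighbourhood(sample_trees)
print(ranking)
```

In pandas, the equivalent would likely be `df.groupby("neighbourhood_name")["genus_name"].nunique().sort_values(ascending=False)`, assuming the DataFrame is named `df`.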
Dunbar-Southlands has the largest mean diameter and Renfrew-Collingwood has the highest tree count. The color of the circles darkens from the bottom to the top of the chart, indicating that genus count increases as tree count increases. Further analysis, using a scatter plot of the two variables, indicates a strong positive relationship between tree count (used here as the measure of density) and genus count. This suggests, though it does not establish, that the City of Vancouver diversifies its genus choices when planting more trees.
The chart above shows the distribution of diameters for the three most frequent genera in Vancouver. PRUNUS has a bimodal distribution, peaking around 5 and 15 inches; ACER peaks around 5 inches; and TILIA peaks around 15 inches.
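One way to quantify the strength of the relationship between tree count and genus count is the Pearson correlation coefficient. The original analysis does not state which measure it used, so this is only an illustrative sketch; the per-neighbourhood values are invented, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-neighbourhood values, invented for illustration only.
tree_counts = [100, 200, 300, 400]
genus_counts = [20, 28, 35, 46]

r = pearson_r(tree_counts, genus_counts)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates a strong positive linear relationship, but it does not by itself show why the two quantities move together.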
Discussion
There is a direct relationship between height and diameter, as corroborated by existing literature. Shaughnessy and Kitsilano stand out as the neighbourhoods with the widest range of diameters, yet trees exceeding 50 inches in diameter are rare and appear as outliers. A separate investigation of street trees in New York State, USA, found that fewer than three percent of street trees exceed 106.7 centimeters in diameter, equivalent to 42 inches (Cowett & Bassuk, 2014). This aligns with the results obtained in the current project.
Renfrew-Collingwood has the greatest variety of genera, as measured by unique genus count, along with the highest tree count, at 384 trees. Dunbar-Southlands has the largest mean diameter, at 14 inches. The most common species in the sample, with 463 of the 5,000 trees, is PRUNUS SERRULATA, a result that is not surprising given its widespread presence on social media during spring. The thickest tree in the dataset, a CEDRUS DEODARA, is in Kitsilano.
Contrary to expectations, downtown Vancouver accounts for over 150 trees in the sample, which challenges the assumption of a lower tree density in a city center. Future research on the relationship between tree age, diameter, and height could help locate Vancouver's oldest street tree.
The project's strengths lie in its substantial sample of 5,000 trees and its exploration of specific questions, supported by interactive visualizations. A notable limitation is that the sample is only a subset of the more than 150,000 trees in the City of Vancouver dataset; using the entire dataset would likely improve the accuracy of the findings.
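The unit conversion behind this comparison is quick to verify. The sketch below converts the 106.7-centimeter threshold to inches; `cm_to_inches` is a hypothetical helper name.

```python
CM_PER_INCH = 2.54  # exact by definition of the international inch


def cm_to_inches(centimeters):
    """Convert a length in centimeters to inches."""
    return centimeters / CM_PER_INCH


threshold_inches = cm_to_inches(106.7)
print(f"106.7 cm is about {threshold_inches:.1f} in")
```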
Dashboard
The dashboard below brings the four visualizations together. Figures 3A and 4 (the top two plots) interact with each other. Figure 2 (bottom left) responds to the radio buttons for height id. Figure 1 (bottom right) responds to the drop-down menu for genus selection. Use the mouse to hover over the charts and to click on neighbourhoods in Figure 3A and on points in Figures 4 and 1. Zoom and pan the bottom two charts as needed.
References
City of Vancouver. (2023, May 27). Street trees. City of Vancouver Open Data Portal. https://opendata.vancouver.ca/explore/dataset/street-trees/information/?disjunctive.species_name&disjunctive.common_name&disjunctive.height_range_id&disjunctive.on_street&disjunctive.neighbourhood_name
Cowett, F. D., & Bassuk, N. L. (2014). Statewide assessment of street trees in New York State, USA. Urban Forestry & Urban Greening, 13(2), 213–220. https://doi.org/10.1016/j.ufug.2014.02.001
Galle, N. J., Halpern, D., Nitoslawski, S., Duarte, F., Ratti, C., & Pilla, F. (2021). Mapping the diversity of street tree inventories across eight cities internationally using open data. Urban Forestry & Urban Greening, 61, 127099. https://doi.org/10.1016/j.ufug.2021.127099
Matasci, G., Coops, N. C., Williams, D. A., & Page, N. (2018). Mapping tree canopies in urban environments using airborne laser scanning (ALS): A Vancouver case study. Forest Ecosystems, 5(1). https://doi.org/10.1186/s40663-018-0146-y
University of British Columbia. (n.d.). Data Visualization · course 3 of UBC’s key capabilities in Data Science Program. Data Visualization. https://viz-learn.mds.ubc.ca/en